---
title: Running Models Locally
---

You can run open-source LLMs and vision models on your own machine with Cua, without relying on cloud APIs. This is ideal for development, for privacy-sensitive work, and for air-gapped systems.

## Hugging Face (transformers)

Use the `huggingface-local/` prefix to run any Hugging Face model locally via the `transformers` library. This supports most text and vision models from the Hugging Face Hub.

**Example:**

```python
model = "huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B"
```
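
If it helps to see the model string in context, here is a minimal end-to-end sketch, assuming the `ComputerAgent` and `Computer` APIs from the Cua quickstart; the connection details and task prompt are placeholders to adapt to your setup:

```python
import asyncio
from agent import ComputerAgent
from computer import Computer

async def main():
    # Placeholder connection details; adjust OS, provider, and
    # credentials for your own environment.
    async with Computer(
        os_type="linux",
        provider_type="cloud",
        name="your-container-name",
        api_key="your-api-key",
    ) as computer:
        # Weights are downloaded from the Hugging Face Hub and run
        # locally via transformers; no cloud inference API is involved.
        agent = ComputerAgent(
            model="huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B",
            tools=[computer],
        )
        async for result in agent.run("Open a browser and go to github.com"):
            print(result)

asyncio.run(main())
```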

## MLX (Apple Silicon)

Use the `mlx/` prefix to run models using the `mlx-vlm` library, optimized for Apple Silicon (M1/M2/M3). This allows fast, local inference for many open-source models.

**Example:**

```python
model = "mlx/mlx-community/UI-TARS-1.5-7B-6bit"
```
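
The MLX model string drops into the same agent setup; a compact sketch, again assuming the quickstart's `ComputerAgent` API. The `-6bit` suffix indicates a quantized build from the `mlx-community` Hub organization, which keeps memory usage modest:

```python
import platform
from agent import ComputerAgent

# mlx-vlm requires Apple Silicon; fail fast on other machines.
assert platform.system() == "Darwin" and platform.machine() == "arm64"

agent = ComputerAgent(
    model="mlx/mlx-community/UI-TARS-1.5-7B-6bit",
    tools=[],  # add a connected Computer instance, as in the example above
)
```

From here the run loop is identical to the transformers example above.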

## Ollama

Use the `ollama_chat/` prefix to run models served by a local Ollama instance. Ollama handles downloading and serving the model weights, which makes it a convenient way to run open-source models locally.

**Example:**

```python
model = "omniparser+ollama_chat/llama3.2:latest"
```
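
Note the composed model string: `llama3.2` is a text-only model, so the example pairs it with OmniParser for visual grounding (Cua's `grounding+planning` composition syntax). A compact sketch, assuming Ollama is running locally and the model has been pulled with `ollama pull llama3.2`:

```python
from agent import ComputerAgent

# OmniParser locates UI elements on screen; the Llama model, served by
# the local Ollama instance, handles the planning.
agent = ComputerAgent(
    model="omniparser+ollama_chat/llama3.2:latest",
    tools=[],  # add a connected Computer instance, as in the first example
)
```

The run loop is then the same as in the transformers example.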
